Abstract

We present a quantum extension of a version of Sanov's theorem focusing
on a hypothesis testing aspect of the theorem: There exists a sequence of
typical subspaces for a given set $\Psi$ of stationary quantum product states
asymptotically separating them from another fixed stationary product state.
Analogously to the classical case, the exponential separating rate is equal to
the infimum of the quantum relative entropy with respect to the quantum
reference state over the set $\Psi$. However, while in the classical case the
separating subsets can be chosen universal, in the sense that they depend only
on the chosen set of i.i.d. processes, in the quantum case the choice of the
separating subspaces depends additionally on the reference state.
1 Introduction



In this article we present a natural quantum version of the classical Sanov's 
theorem as part of our attempt to explore basic concepts and results at the 
interface of classical information theory and stochastics from the point of view 
of quantum information theory. 

Among those classical results the Shannon-McMillan-Breiman theorem (SMB
theorem) plays a crucial role. It clarifies the concept of typical subsets,
yielding the rigorous background for asymptotically optimal lossless data
compression. It says that a long n-block message string from an ergodic data
source belongs most likely to a typical subset (of the generally very much
larger set of all possible messages). The cardinality of these typical sets
grows with the length n of the message at an exponential rate given by the
Shannon entropy rate of the data source. As a general rule, when passing to the
quantum situation, the notion of a typical subset has to be replaced by that of
a typical subspace (of an entire Hilbert space describing the pure n-block
states of the quantum data source), with the dimension of that subspace being
the quantity growing exponentially fast at a rate given now by the von Neumann
entropy rate as n goes to infinity (cf. [?] in the i.i.d. situation, [?], [2]
in the general ergodic case).

We know from the classical situation that typical subsets have even more
striking properties, when chosen in the right way: For a given alphabet A and a
given entropy rate there is (a bit surprisingly) a universal sequence of
typical subsets growing at the given rate for all ergodic sources which do not
exceed the given entropy rate (this result has been generalized to the quantum
context by Kaltchenko and Yang [?]). Moreover, for any ergodic data source P we
can find a sequence of typical subsets growing at the rate given by the entropy
and at the same time separating it exponentially well from any i.i.d.
(reference) data source Q, in the sense that the Q-probability of the entire
P-typical subset goes to zero at an exponential rate given by the relative
entropy rate h(P, Q). Furthermore, the relative entropy is the best achievable
(optimal) separation rate. This assertion, which gives an operational
interpretation of the relative entropy, is Stein's lemma. We mention that the
i.i.d. condition concerning the reference source cannot be weakened too much,
since there are examples where even the relative entropy has no asymptotic
rate, though the reference source is very well mixing (B-process, cf. [15]).
A quantum generalization of this result can be found in [?] for the case that
both sources are i.i.d., and in [5] for the case of a general ergodic quantum
information source. This result was mainly inspired by [10], where complete
ergodicity was assumed and optimality was still left open.

From the viewpoint of information theory or statistical hypothesis testing the
essential assertion of Sanov's theorem is that it represents a universal
version of Stein's lemma by saying that for a set $\Omega$ of i.i.d. sources
there exists a common choice of the typical set such that the probability with
respect to the i.i.d. reference source Q goes to zero at a rate given by
$\inf_{P \in \Omega} h(P, Q)$.

Originally Sanov's theorem is of course a result on large deviations of
empirical distributions (cf. [14], [?]). It is the information-theoretical
viewpoint taken here which suggests looking at it as a large deviation
principle for typical subsets. With the main topic of this paper being a
quantum theorem of Sanov type, it is especially appealing to shift the focus
from empirical distributions to typical subspaces, since the notion of an
individual quantum message string is at least problematic, and, as will be seen
by an example, a reasonable attempt to define something like a quantum
empirical distribution via partial traces leads to a separation rate worse than
the relative entropy rate (see the last section).

Another aspect of the classical Sanov result has to be modified for the quantum
situation: The typical subspace will no longer be universal for all i.i.d.
reference sources, but has to be chosen in dependence on the reference source.
So only 'one half' of universality is maintained when passing to quantum
sources, namely that which refers to the set $\Omega$. This will be
demonstrated by an example in the last section. The basic mechanism behind this
no-go result is, heuristically speaking: In the quantum setting even pure
states cannot be distinguished with certainty, while classical letters can. In
our forthcoming paper [?] we extend the results given here to the case where
only stationarity is assumed for the states in $\Omega$.

2 A quantum version of Sanov's theorem 

Let A be a finite set with cardinality #A = d. By $\mathcal{P}(A)$ we denote
the set of probability distributions on A. The relative entropy H(P, Q) of a
probability distribution $P \in \mathcal{P}(A)$ with respect to a distribution
Q is defined as usual:

$$H(P, Q) := \begin{cases} \sum_{a \in A} P(a)\,(\log P(a) - \log Q(a)), & \text{if } P \ll Q, \\ \infty, & \text{otherwise,} \end{cases} \tag{1}$$

where log denotes the base 2 logarithm. For the base e logarithm we use the
notation ln. The function $H(\cdot, Q)$ is continuous on $\mathcal{P}(A)$ if
the reference distribution Q has full support A. Otherwise it is lower
semi-continuous. The relative entropy distance from the reference distribution
Q to a subset $\Omega \subseteq \mathcal{P}(A)$ is given by:

$$H(\Omega, Q) := \inf_{P \in \Omega} H(P, Q). \tag{2}$$
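
Definitions (1) and (2) can be illustrated by a minimal Python sketch. The function names `relative_entropy` and `entropy_distance` are hypothetical, distributions are assumed to be given as lists of probabilities over a common finite alphabet, and the set $\Omega$ is assumed finite so that the infimum is a minimum.

```python
import math

def relative_entropy(p, q):
    """H(P, Q) in bits; returns math.inf unless P is absolutely continuous w.r.t. Q."""
    total = 0.0
    for pa, qa in zip(p, q):
        if pa == 0.0:
            continue            # convention: 0 * log 0 = 0
        if qa == 0.0:
            return math.inf     # P puts mass where Q does not
        total += pa * (math.log2(pa) - math.log2(qa))
    return total

def entropy_distance(omega, q):
    """H(Omega, Q) for a finite list omega of distributions (assumption: finite set)."""
    return min((relative_entropy(p, q) for p in omega), default=math.inf)

print(relative_entropy([1.0, 0.0], [0.5, 0.5]))
print(entropy_distance([[1.0, 0.0], [0.5, 0.5]], [0.5, 0.5]))
```

The empty set yields infinity, in line with the usual convention for an infimum over an empty set.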

Our starting point is the classical Sanov's theorem formulated from the point
of view of hypothesis testing:

Theorem 2.1 (Sanov's Theorem) Let $Q \in \mathcal{P}(A)$ and
$\Omega \subseteq \mathcal{P}(A)$. There exists a sequence
$\{M_n\}_{n \in \mathbb{N}}$ of subsets $M_n \subseteq A^n$ with

$$\lim_{n \to \infty} P^n(M_n) = 1, \quad \forall P \in \Omega, \tag{3}$$

such that

$$\lim_{n \to \infty} \frac{1}{n} \log Q^n(M_n) = -H(\Omega, Q). \tag{4}$$

Moreover, for each sequence of sets $\{M_n\}$ fulfilling (3) we have

$$\liminf_{n \to \infty} \frac{1}{n} \log Q^n(M_n) \geq -H(\Omega, Q),$$

such that $H(\Omega, Q)$ is the best achievable separation rate.

We emphasize that in the above formulation we omitted the assertion that the
sets $M_n$ can be chosen independently of the reference distribution Q.
However, as will be shown in the last section, in the quantum case this
universality feature is no longer valid, and Theorem 2.1 is the strongest
version that has a quantum analogue. It is an immediate consequence of
Lemma 2.3 and is related to the usual formulation of Sanov's theorem ([?], see
also Theorem 3.2.21 in [?]) in terms of empirical measures
$P_{x^n} := \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ of sequences
$x^n := \{x_1, \ldots, x_n\}$ as follows: By the strong law of large numbers,
the sequence of empirical distributions $\{P_{x^n}\}$ formed along an i.i.d.
sequence $\{x_1, x_2, \ldots\}$ of letters distributed according to a
probability measure $P \in \mathcal{P}(A)$ tends to P almost surely. Hence for
any neighbourhood U of P we have the limit relation
$\lim_{n \to \infty} P^n(\{x^n : P_{x^n} \in U\}) = 1$, meaning that the
sequence of sets $\{x^n : P_{x^n} \in U\}$ is typical for $P^n$. If $\Omega$ is
an open set, then we may choose U as $\Omega$, and
$\{x^n : P_{x^n} \in U\}$ is universally typical for all $P^n$,
$P \in \Omega$. Now Sanov's theorem in its traditional form says that
$-\frac{1}{n} \log Q^n(\{x^n : P_{x^n} \in \Omega\}) \to H(\Omega, Q)$.
So it says that the (explicitly specified) typical sets separate $Q^n$
exponentially fast from all $P^n$, $P \in \Omega$, at the rate
$H(\Omega, Q)$.
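
The traditional form of the statement can be checked numerically in a toy case. The following sketch assumes a binary alphabet, a reference distribution Q with Q(1) = 0.3, and the open set $\Omega = \{P : P(1) > 0.6\}$; these parameters and all function names are illustrative choices, not taken from the text. The $Q^n$-probability of the set of sequences with empirical distribution in $\Omega$ is computed exactly by summing over type classes.

```python
import math

def kl_bernoulli(p, q):
    """Relative entropy H(P, Q) in bits for Bernoulli(p) vs Bernoulli(q), 0 < q < 1."""
    h = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            h += a * math.log2(a / b)
    return h

def log2_binom_pmf(n, k, q):
    """log2 of C(n, k) q^k (1-q)^(n-k): the Q^n-mass of the type class with k ones."""
    ln = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
          + k * math.log(q) + (n - k) * math.log(1.0 - q))
    return ln / math.log(2.0)

def sanov_rate(n, q, threshold):
    """-(1/n) log2 Q^n({x^n : empirical frequency of ones > threshold})."""
    logs = [log2_binom_pmf(n, k, q) for k in range(n + 1) if k / n > threshold]
    top = max(logs)                       # log-sum-exp to avoid underflow
    total = top + math.log2(sum(2.0 ** (v - top) for v in logs))
    return -total / n

q, threshold = 0.3, 0.6
target = kl_bernoulli(threshold, q)       # H(Omega, Q), attained on the boundary
for n in (50, 500, 5000):
    print(n, round(sanov_rate(n, q, threshold), 4), round(target, 4))
```

The printed rates approach $H(\Omega, Q)$ from above, with a gap of order $\log(n)/n$ caused by the polynomial number of types.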

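Passing to the quantum setting we substitute the set A by a $C^*$-algebra
$\mathcal{A}$ with dimension $\dim \mathcal{A} = d < \infty$ and the cartesian
product $A^n := \times_{i=1}^{n} A$ by the tensor product
$\mathcal{A}^{(n)} := \bigotimes_{i=1}^{n} \mathcal{A}$. We denote by
$\mathcal{S}(\mathcal{A})$ the set of quantum states on the algebra of
observables $\mathcal{A}$, i.e. $\mathcal{S}(\mathcal{A})$ is the set of
positive functionals $\varphi$ on $\mathcal{A}$ fulfilling the normalisation
condition $\varphi(\mathbf{1}) = 1$. For $\psi \in \mathcal{S}(\mathcal{A})$ we
mean by $\psi^{\otimes n}$ a product state on $\mathcal{A}^{(n)}$. The quantum
relative entropy $S(\psi, \varphi)$ of the state
$\psi \in \mathcal{S}(\mathcal{A})$ with respect to the reference state
$\varphi \in \mathcal{S}(\mathcal{A})$ is defined by:

$$S(\psi, \varphi) := \begin{cases} \operatorname{tr}_{\mathcal{A}} D_\psi (\log D_\psi - \log D_\varphi), & \text{if } \operatorname{supp}(\psi) \leq \operatorname{supp}(\varphi), \\ \infty, & \text{otherwise,} \end{cases} \tag{5}$$

where $D_\psi$ and $D_\varphi$ denote the density operators of $\psi$ and
$\varphi$. Observe that in the case of a commutative $C^*$-algebra
$\mathcal{A}$ the quantum relative entropy S coincides with the classical
relative entropy H defined in (1), where the probabilities are defined as the
expectations of minimal projectors in $\mathcal{A}$. The functional
$S(\cdot, \varphi)$ is continuous on $\mathcal{S}(\mathcal{A})$ only if the
reference state $\varphi$ is faithful, i.e.
$\operatorname{supp} \varphi = \mathbf{1}_{\mathcal{A}}$, otherwise it is lower
semi-continuous. The relative entropy distance from the reference state
$\varphi$ to a subset $\Psi \subseteq \mathcal{S}(\mathcal{A})$ is given by:

$$S(\Psi, \varphi) := \inf_{\psi \in \Psi} S(\psi, \varphi). \tag{6}$$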

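Now we are in the position to state our main result:

Theorem 2.2 (Quantum Sanov Theorem) Let
$\varphi \in \mathcal{S}(\mathcal{A})$ and
$\Psi \subseteq \mathcal{S}(\mathcal{A})$. There exists a sequence
$\{p_n\}_{n \in \mathbb{N}}$ of orthogonal projections
$p_n \in \mathcal{A}^{(n)}$ such that

$$\lim_{n \to \infty} \psi^{\otimes n}(p_n) = 1, \quad \forall \psi \in \Psi, \tag{7}$$

$$\lim_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) = -\inf_{\psi \in \Psi} S(\psi, \varphi).$$

Moreover, for each sequence of projections $\{p_n\}$ fulfilling (7) we have

$$\liminf_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) \geq -\inf_{\psi \in \Psi} S(\psi, \varphi),$$

such that $S(\Psi, \varphi)$ is the best achievable separation rate.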

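The proof of Theorem 2.2 will be based to a large extent on the following
classical lemma, which is a stronger version of Theorem 2.1.

Lemma 2.3 Let $Q \in \mathcal{P}(A)$ and $\Omega \subseteq \mathcal{P}(A)$.
For each sequence $\{\varepsilon_n\}_{n \in \mathbb{N}}$ satisfying
$\varepsilon_n \searrow 0$ and
$\frac{\log(n+1)}{n \varepsilon_n^2} \to 0$ there exists a sequence
$\{M_n\}_{n \in \mathbb{N}}$ of subsets $M_n \subseteq A^n$ such that for each
$P \in \Omega$ there is an $N(P) \in \mathbb{N}$ with

$$P^n(M_n) \geq 1 - (n+1)^{\#A} \cdot 2^{-n b \varepsilon_n^2}, \quad \forall n \geq N(P), \tag{8}$$

where b is a positive number. Moreover we have:

1. $\liminf_{n \to \infty} \frac{1}{n} \log Q^n(M_n) \geq -H(\Omega, Q)$,

2. $Q^n(M_n) \leq (n+1)^{\#A} \cdot 2^{-n I_n}$, $\forall n \in \mathbb{N}$,
where $I_n \geq 0$ and for all $n \in \mathbb{N}$ fulfilling
$\varepsilon_n \leq \frac{1}{2}$

$$0 \leq H(\Omega, Q) - I_n \leq \log(\#A)\,\varepsilon_n - \varepsilon_n \log \varepsilon_n - \varepsilon_n \log Q_{\min} \tag{9}$$

holds with $Q_{\min} := \min\{Q(a) : Q(a) > 0,\ a \in A\}$.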

Proof: Due to the classical Stein's lemma any sequence of subsets
$\{M_n\}_{n \in \mathbb{N}}$ which has asymptotically a non-vanishing measure
with respect to the product distributions $P^n$ satisfies:

$$\liminf_{n \to \infty} \frac{1}{n} \log Q^n(M_n) \geq -H(P, Q), \tag{10}$$

where $Q \in \mathcal{P}(A)$ is the reference distribution. Then the lower
bound (10) implies the first item of Lemma 2.3.

We partition the set $\Omega$ into the set $\Omega_1$ consisting of probability
distributions which are absolutely continuous w.r.t. Q and its complement
$\Omega_2$ within $\Omega$, i.e.

$$\Omega_1 := \{P \in \Omega : H(P, Q) < \infty\}, \quad \text{and} \quad \Omega_2 := \Omega_1^c \cap \Omega.$$

Observe that

$$H(\Omega, Q) = \begin{cases} H(\Omega_1, Q), & \text{if } \Omega_1 \neq \emptyset, \\ \infty, & \text{otherwise} \end{cases} \tag{11}$$

holds. We will treat these two sets separately.

It is obvious that we can ideally distinguish the distributions in $\Omega_2$
from Q; we just have to set

$$M_{2,n} := \{x^n \in A^n : Q^n(x^n) = 0 \text{ and } P^n(x^n) > 0 \text{ for some } P \in \Omega_2\}. \tag{12}$$

Then we have for all $n \in \mathbb{N}$

$$Q^n(M_{2,n}) = 0. \tag{13}$$

Moreover we have for each $P \in \Omega_2$ and $n \in \mathbb{N}$

$$P^n(M_{2,n}) = 1 - q_P^n \to 1 \quad (n \to \infty), \tag{14}$$

where

$$q_P := P(A_+),$$

with

$$A_+ := \{a \in A : Q(a) > 0\}. \tag{15}$$

Observe that the speed of convergence in (14) is exponential.

In treating the set $\Omega_1$ we may consider the restricted alphabet $A_+$
defined in (15) only. Note that $H(\cdot, Q)$ is continuous as a functional on
$\mathcal{P}(A_+)$. Choose a sequence $\varepsilon_n \searrow 0$ with
$\frac{\log(n+1)}{n \varepsilon_n^2} \to 0$ and define the following decreasing
family of sets

$$\Omega_n := \{R \in \mathcal{P}(A_+) : \|R - P\|_1 \leq \varepsilon_n \text{ for at least one } P \in \Omega_1\}.$$

Observe that $\Omega_n \searrow \overline{\Omega}_1$. Moreover we set

$$M_{1,n} := \{x^n \in A_+^n : P_{x^n} \in \Omega_n\}, \tag{16}$$

where $P_{x^n}$ denotes the empirical distribution or type of the sequence
$x^n$. Now, by type counting methods (cf. [7], Section 12.1) and Pinsker's
inequality $H(P_1, P_2) \geq \frac{1}{2 \ln 2} \|P_1 - P_2\|_1^2$ we arrive at

$$P^n(M_{1,n}^c) \leq (n+1)^{\#A_+}\, 2^{-n b \varepsilon_n^2} \to 0 \quad (n \to \infty), \tag{17}$$

for each $P \in \Omega_1$, where b is a positive number and $M_n^c$ denotes the
complement of the set $M_n$.

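The upper bounds with respect to the distribution Q are a consequence of type
counting methods together with the (lower semi-) continuity of the functional
$H(\cdot, Q)$ combined with (11) and the fact that
$\Omega_n \searrow \overline{\Omega}_1$:

$$Q^n(M_{1,n}) \leq (n+1)^{\#A_+}\, 2^{-n H(\Omega_n, Q)} \quad \text{(by type counting, cf. [7], Section 12.1).} \tag{18}$$

We set $I_n := H(\Omega_n, Q)$. Observe that the sequence
$(I_n)_{n \in \mathbb{N}}$ is increasing and
$I_n \leq \min_{P \in \overline{\Omega}_1} H(P, Q)$ for all
$n \in \mathbb{N}$, since $\Omega_n \searrow \overline{\Omega}_1$. Next,
observe that

$$\overline{\Omega}_n \subseteq \{R \in \mathcal{P}(A_+) : \|R - P\|_1 \leq \varepsilon_n \text{ for at least one } P \in \overline{\Omega}_1\}. \tag{19}$$

Let $R_n \in \overline{\Omega}_n$ be such that
$H(R_n, Q) = \min_{R \in \overline{\Omega}_n} H(R, Q)$. By the continuity we
have $H(R_n, Q) = I_n$. According to (19), for each $n \in \mathbb{N}$ there is
a distribution $P_n \in \overline{\Omega}_1$ such that
$\|R_n - P_n\|_1 \leq \varepsilon_n$. Using the inequality
$|H(P) - H(R)| \leq \log(\#A)\,\|P - R\|_1 + \eta(\|P - R\|_1)$, valid for
distributions P, R with $\|P - R\|_1 \leq \frac{1}{2}$, where
$\eta(t) := -t \log t$, and $Q_{\min} := \min\{Q(a) : a \in A_+\}$, we obtain
finally

$$\begin{aligned}
0 \leq H(\Omega, Q) - I_n &= H(\Omega_1, Q) - I_n && \text{(by (11))} \\
&= \min_{P \in \overline{\Omega}_1} H(P, Q) - I_n && \text{(by continuity)} \\
&\leq H(P_n, Q) - H(R_n, Q) \\
&= H(R_n) - H(P_n) + \sum_{a \in A_+} (R_n(a) - P_n(a)) \log Q(a) \\
&\leq \log(\#A_+)\,\|P_n - R_n\|_1 + \eta(\|P_n - R_n\|_1) - \|P_n - R_n\|_1 \log Q_{\min} \\
&\leq \log(\#A)\,\varepsilon_n - \varepsilon_n \log \varepsilon_n - \varepsilon_n \log Q_{\min}.
\end{aligned}$$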

Now, setting

$$M_n := M_{1,n} \cup M_{2,n},$$

we see by (13) and (18) that for all $n \in \mathbb{N}$ we have

$$Q^n(M_n) \leq Q^n(M_{1,n}) + Q^n(M_{2,n}) \leq (n+1)^{\#A}\, 2^{-n I_n}.$$

Moreover for each $P \in \Omega$ we may infer from (14) and (17) that for all
sufficiently large $n \in \mathbb{N}$

$$P^n(M_n) \geq 1 - (n+1)^{\#A}\, 2^{-n b \varepsilon_n^2}$$

holds. □
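
The construction in the proof can be traced numerically in a small case. The sketch below assumes a binary alphabet, a finite set $\Omega$ of Bernoulli distributions with parameters 0.7 and 0.8, a full-support reference distribution with Q(1) = 0.3, and $\varepsilon_n = n^{-1/4}$ (which satisfies the growth condition of the lemma). These choices and all function names are illustrative. Since Q has full support, $\Omega_2$ is empty and $M_n = M_{1,n}$.

```python
import math

def log2_binom_pmf(n, k, r):
    """log2 of C(n, k) r^k (1-r)^(n-k) for 0 < r < 1."""
    ln = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
          + k * math.log(r) + (n - k) * math.log(1.0 - r))
    return ln / math.log(2.0)

def kl_bernoulli(p, q):
    """Relative entropy in bits between Bernoulli(p) and Bernoulli(q)."""
    h = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            h += a * math.log2(a / b)
    return h

def typical_counts(n, omega, eps):
    """Counts k of ones whose type lies within l1-distance eps of some P in omega."""
    return [k for k in range(n + 1)
            if any(2.0 * abs(k / n - p) <= eps for p in omega)]

def log2_mass(n, counts, r):
    """log2 of the Bernoulli(r)^n probability of all sequences with count in counts."""
    logs = [log2_binom_pmf(n, k, r) for k in counts]
    top = max(logs)
    return top + math.log2(sum(2.0 ** (v - top) for v in logs))

def rate_I_n(omega, q, eps):
    """I_n = H(Omega_n, Q): minimise H(R, Q) over the eps-neighbourhood of omega."""
    best = math.inf
    for p in omega:
        lo, hi = max(0.0, p - eps / 2.0), min(1.0, p + eps / 2.0)
        r = min(max(q, lo), hi)      # point of [lo, hi] closest to q (H(., Q) is convex)
        best = min(best, kl_bernoulli(r, q))
    return best

omega, q = [0.7, 0.8], 0.3
for n in (100, 1000, 4000):
    eps = n ** -0.25
    counts = typical_counts(n, omega, eps)
    p_mass = min(2.0 ** log2_mass(n, counts, p) for p in omega)
    lhs = log2_mass(n, counts, q)                      # log2 Q^n(M_n)
    rhs = 2 * math.log2(n + 1) - n * rate_I_n(omega, q, eps)
    print(n, round(p_mass, 6), round(lhs / n, 4), round(rhs / n, 4))
```

Each printed row shows the smallest $P^n(M_n)$ over $\Omega$, the normalised logarithm of $Q^n(M_n)$, and the normalised type-counting bound from item 2 of the lemma.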



3 Proof of the quantum Sanov theorem 

Before we prove the quantum Sanov theorem 2.2 we cite the relevant known
results. We define the maximal separating exponent

$$\beta_{\varepsilon,n}(\psi^{\otimes n}, \varphi^{\otimes n}) := \min\{\log \varphi^{\otimes n}(q) : q \in \mathcal{A}^{(n)} \text{ projection},\ \psi^{\otimes n}(q) \geq 1 - \varepsilon\}.$$

Proposition 3.1 Let $\psi, \varphi \in \mathcal{S}(\mathcal{A})$ with the
relative entropy $S(\psi, \varphi)$. Then for every $\varepsilon \in (0, 1)$

$$\lim_{n \to \infty} \frac{1}{n} \beta_{\varepsilon,n}(\psi^{\otimes n}, \varphi^{\otimes n}) = -S(\psi, \varphi). \tag{20}$$

The assertion of Proposition 3.1 was shown by Ogawa and Nagaoka in [?]. A
different proof based on the approach of Hiai and Petz in [10] was given in [5].
Proof of the main Theorem 2.2

1. Proof of the lower bound: Due to Proposition 3.1 any sequence of
projections $\{p_n\}_{n \in \mathbb{N}}$ which has asymptotically a
non-vanishing expectation value with respect to the stationary product state
$\{\psi^{\otimes n}\}_{n \in \mathbb{N}}$ satisfies:

$$\liminf_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) \geq -S(\psi, \varphi), \tag{21}$$

where $\varphi \in \mathcal{S}(\mathcal{A})$ is a fixed reference state.
The lower bound (21) implies the lower bound

$$\liminf_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) \geq -S(\Psi, \varphi)$$

for any sequence $\{p_n\}_{n \in \mathbb{N}}$ of orthogonal projections
$p_n \in \mathcal{A}^{(n)}$ satisfying condition (7) in Theorem 2.2.

2. Proof of the upper bound: To obtain the upper bound

$$\limsup_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) \leq -S(\Psi, \varphi),$$

where $\varphi$ is a fixed reference state, it is obviously sufficient to show
that to each positive $\delta$ there exists a sequence $p_n$ such that

$$\lim_{n \to \infty} \psi^{\otimes n}(p_n) = 1, \quad \forall \psi \in \Psi, \tag{22}$$

and

$$\limsup_{n \to \infty} \frac{1}{n} \log \varphi^{\otimes n}(p_n) \leq -S(\Psi, \varphi) + \delta$$

is fulfilled for sufficiently large n. To show this we will apply the classical
result, Lemma 2.3, to states restricted to appropriate abelian subalgebras
approximating the quasi-local algebra $\mathcal{A}^{\infty}$.

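Consider the spectral decomposition of the density operator $D_\varphi$:

$$D_\varphi = \sum_{i=1}^{d} \lambda_i e_i,$$

where $\lambda_i$ are the eigenvalues and $e_i$ are the corresponding spectral
projections. It follows a decomposition for
$D_{\varphi^{\otimes l}} = D_\varphi^{\otimes l}$:

$$D_\varphi^{\otimes l} = \sum_{i_1, \ldots, i_l = 1}^{d} \Big( \prod_{j=1}^{l} \lambda_{i_j} \Big) \bigotimes_{j=1}^{l} e_{i_j},$$

which leads to the spectral representation

$$D_\varphi^{\otimes l} = \sum_{l_1, \ldots, l_d :\ \sum_i l_i = l} \Big( \prod_{k=1}^{d} \lambda_k^{l_k} \Big) e_{l_1, \ldots, l_d}$$

with the spectral projections

$$e_{l_1, \ldots, l_d} := \sum_{(i_1, \ldots, i_l) \in I_{l_1, \ldots, l_d}} \bigotimes_{j=1}^{l} e_{i_j},$$

where
$I_{l_1, \ldots, l_d} := \{(i_1, \ldots, i_l) : \#\{j : i_j = k\} = l_k \text{ for } k \in \{1, \ldots, d\}\}$.

Let $\psi$ be a state on $\mathcal{A}$ and $l \in \mathbb{N}$. We denote by
$\mathcal{D}_{l,\psi}$ the abelian subalgebra of $\mathcal{A}^{(l)}$ generated
by
$\{e_{l_1 \ldots l_d}\}_{l_1 \ldots l_d} \cup \{e_{l_1 \ldots l_d} D_{\psi^{\otimes l}} e_{l_1 \ldots l_d}\}_{l_1 \ldots l_d}$.
As a finite-dimensional abelian algebra, it has a representation

$$\mathcal{D}_{l,\psi} = \bigoplus_{i=1}^{d_l} \mathbb{C} \cdot f_{l,i},$$

where $\{f_{l,i}\}_{i=1}^{d_l}$ is a set of mutually orthogonal minimal
projections in $\mathcal{D}_{l,\psi}$.
Hiai and Petz have shown that

$$S(\psi^{\otimes l}, \varphi^{\otimes l}) = S(\psi^{\otimes l} \restriction \mathcal{D}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{D}_{l,\psi}) + S(\psi^{\otimes l} \circ E_l) - S(\psi^{\otimes l}), \tag{23}$$

where $\psi \restriction \mathcal{D}$ denotes the restriction of a state
$\psi \in \mathcal{S}(\mathcal{A})$ to a subalgebra
$\mathcal{D} \subseteq \mathcal{A}$ and $E_l$ is the conditional expectation
with respect to the canonical trace in $\mathcal{A}^{(l)}$:

$$E_l : a \mapsto E_l(a) := \sum_{l_1, \ldots, l_d :\ \sum_i l_i = l} e_{l_1 \ldots l_d}\, a\, e_{l_1 \ldots l_d}.$$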

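Observe that

$$S(\psi^{\otimes l} \circ E_l) - S(\psi^{\otimes l}) \leq d \log(l+1) \quad \text{(cf. [10], [?])}, \tag{24}$$

which gives the lower bound

$$S(\psi^{\otimes l} \restriction \mathcal{D}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{D}_{l,\psi}) \geq l\,S(\psi, \varphi) - d \log(l+1), \tag{25}$$

implying

$$\lim_{l \to \infty} \frac{1}{l} S(\psi^{\otimes l} \restriction \mathcal{D}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{D}_{l,\psi}) = S(\psi, \varphi).$$

Next we consider a maximal abelian refinement $\mathcal{B}_{l,\psi}$ of
$\mathcal{D}_{l,\psi}$ in the sense of the algebra $\mathcal{A}^{(l)}$:

$$\mathcal{B}_{l,\psi} := \bigoplus_{j=1}^{d_l} \bigoplus_{k=1}^{a_{l,j}} \mathbb{C} \cdot g_{l,j,k},$$

where $g_{l,j,k}$ are one-dimensional projections in the algebra
$\mathcal{A}^{(l)}$ such that
$f_{l,j} = \sum_{k=1}^{a_{l,j}} g_{l,j,k}$. This means that
$\mathcal{B}_{l,\psi} \supseteq \mathcal{D}_{l,\psi}$. It holds by monotonicity
of the relative entropy and by the estimate (25)

$$\begin{aligned}
S(\psi^{\otimes l} \restriction \mathcal{B}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi}) &\geq S(\psi^{\otimes l} \restriction \mathcal{D}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{D}_{l,\psi}) \\
&\geq l\,(S(\psi, \varphi) - \eta_l),
\end{aligned} \tag{26}$$

where we used the abbreviation $\eta_l := \frac{d \log(l+1)}{l}$ in the last
line.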

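Due to the Gelfand isomorphism and the Riesz representation theorem the
restricted states $\psi^{\otimes l} \restriction \mathcal{B}_{l,\psi}$ and
$\varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi}$ can be identified with
probability measures P and Q on the compact maximal ideal space $B_{l,\psi}$
corresponding to the $d^l$-dimensional abelian algebra
$\mathcal{B}_{l,\psi}$. The relative entropy of P with respect to Q is
determined by:

$$H(P, Q) = S(\psi^{\otimes l} \restriction \mathcal{B}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi}) \geq l \cdot (S(\psi, \varphi) - \eta_l).$$

Similarly, the states
$\psi^{\otimes nl} \restriction \mathcal{B}_{l,\psi}^{(n)}$ and
$\varphi^{\otimes nl} \restriction \mathcal{B}_{l,\psi}^{(n)}$ correspond to
the product measures $P^n$ and $Q^n$ on the product space $B_{l,\psi}^n$.
We define

$$S_l := \inf\{S(\psi^{\otimes l} \restriction \mathcal{B}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi}) : \psi \in \Psi\}$$

and fix a $\psi_0 \in \Psi$. For any $\psi \in \Psi$ and each
$l \in \mathbb{N}$ there exists a unitary operator
$U_\psi \in \mathcal{A}^{(l)}$ that transforms the minimal projections spanning
$\mathcal{B}_{l,\psi}$ into the minimal projections of
$\mathcal{B}_{l,\psi_0}$ and that leaves the spectral subspaces of
$D_{\varphi^{\otimes l}}$ invariant. Let us denote by
$\mathcal{U}_l(\Psi, \varphi)$ the set of unitaries having these properties.

To each $\psi \in \Psi$ denote by $\tilde{\psi}^{(l)}$ the state on
$\mathcal{A}^{(l)}$ with density operator
$U_\psi D_{\psi^{\otimes l}} U_\psi^*$.

Then we have

$$S(\psi^{\otimes l} \restriction \mathcal{B}_{l,\psi},\ \varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi}) = S(\tilde{\psi}^{(l)} \restriction \mathcal{B}_{l,\psi_0},\ \varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi_0}).$$

Let $\Omega_l$ be the set of probability measures on $B_{l,\psi_0}$
corresponding to all $\tilde{\psi}^{(l)} \restriction \mathcal{B}_{l,\psi_0}$,
where $\psi \in \Psi$. Further let the measure Q on $B_{l,\psi_0}$ correspond
to the restricted reference state
$\varphi^{\otimes l} \restriction \mathcal{B}_{l,\psi_0}$. Then

$$H(\Omega_l, Q) \geq S_l \geq l \cdot (S(\Psi, \varphi) - \eta_l), \tag{27}$$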
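where the second inequality follows from (26). Due to Lemma 2.3 there exists a
sequence $\{M_n\}_{n \in \mathbb{N}}$ of subsets
$M_n \subseteq B_{l,\psi_0}^n$ (cf. (16)) such that

$$\lim_{n \to \infty} P^n(M_n) = 1, \quad \forall P \in \Omega_l \tag{28}$$

and for every $n \in \mathbb{N}$

$$Q^n(M_n) \leq (n+1)^{d^l}\, 2^{-n I_n(l)},$$

where $I_n(l) \nearrow H(\Omega_l, Q)$ for $n \to \infty$. Moreover, we know
that
$I_n(l) \geq H(\Omega_l, Q) - \varepsilon_n (\log d^l - \log \varepsilon_n - \log Q_{\min}(l))$,
where $Q_{\min}(l) := \min\{Q(a) : a \in B_{l,\psi_0}\}$. We introduce the
abbreviation
$\Delta_n(l) := \varepsilon_n (\log d^l - \log \varepsilon_n - \log Q_{\min}(l))$.
It holds:

$$\frac{1}{n} \log Q^n(M_n) \leq \frac{d^l \log(n+1)}{n} - I_n(l) \leq \frac{d^l \log(n+1)}{n} - (H(\Omega_l, Q) - \Delta_n(l)). \tag{29}$$

To each $M_n \subseteq B_{l,\psi_0}^n$ there corresponds a projection $p_{nl}$
in $\mathcal{B}_{l,\psi_0}^{(n)} \subseteq \mathcal{A}^{(nl)}$. For an
arbitrary $m \in \mathbb{N}$ such that $m = nl + r$ with
$r \in \{0, \ldots, l-1\}$ we define a projection
$p_m \in \mathcal{A}^{(m)}$ by

$$p_m := p_{nl} \otimes \mathbf{1}_{[nl+1,\,nl+r]},$$

where $\mathbf{1}_{[nl+1,\,nl+r]}$ denotes the identity in the local algebra
$\mathcal{A}_{[nl+1,\,nl+r]}$. It holds

$$(\tilde{\psi}^{(l)})^{\otimes n}(p_{nl}) = P^n(M_n), \quad \forall \psi \in \Psi$$

and

$$\frac{1}{m} \log \varphi^{\otimes m}(p_m) \leq \frac{1}{nl} \log \varphi^{\otimes nl}(p_{nl}) = \frac{1}{nl} \log Q^n(M_n).$$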


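Using (28), (29) and (27) we conclude

$$\lim_{n \to \infty} \psi^{\otimes nl}(U_\psi^{*\otimes n}\, p_{nl}\, U_\psi^{\otimes n}) = 1, \quad \forall \psi \in \Psi \tag{30}$$

and

$$\begin{aligned}
\frac{1}{m} \log \varphi^{\otimes m}(p_m) &\leq \frac{1}{nl} \log Q^n(M_n) \\
&\leq \frac{d^l \log(n+1)}{nl} - \frac{1}{l}\,(H(\Omega_l, Q) - \Delta_n(l)) \\
&\leq \frac{d^l \log(n+1)}{nl} - S(\Psi, \varphi) + \eta_l + \frac{\Delta_n(l)}{l}.
\end{aligned} \tag{31}$$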



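For fixed $l \in \mathbb{N}$ we construct for each $n \in \mathbb{N}$ the
projection:

$$\overline{p}_{nl} := \bigvee_{U \in \mathcal{U}_l(\Psi, \varphi)} U^{*\otimes n}\, p_{nl}\, U^{\otimes n}.$$

For an arbitrary number $m = nl + r$, $r \in \{0, \ldots, l-1\}$, we define

$$\overline{p}_m := \overline{p}_{nl} \otimes \mathbf{1}_{[nl+1,\,nl+r]}.$$

It follows for arbitrary $\psi \in \Psi$ and each
$m = nl + r \in \mathbb{N}$:

$$\psi^{\otimes m}(\overline{p}_m) = \psi^{\otimes nl}(\overline{p}_{nl}) \geq \psi^{\otimes nl}(U_\psi^{*\otimes n}\, p_{nl}\, U_\psi^{\otimes n}). \tag{32}$$

Using the estimate (30) we obtain the general statement:

$$\lim_{m \to \infty} \psi^{\otimes m}(\overline{p}_m) = 1, \quad \forall \psi \in \Psi.$$

Next we consider the expectation values
$\varphi^{\otimes nl}(U^{*\otimes n}\, p_{nl}\, U^{\otimes n})$ for any
$U \in \mathcal{U}_l(\Psi, \varphi)$ and $n \in \mathbb{N}$. From the assumed
invariance of $D_{\varphi^{\otimes l}}$ with respect to the unitary
transformations given by elements of $\mathcal{U}_l(\Psi, \varphi)$, we
conclude

$$\varphi^{\otimes nl}(U^{*\otimes n}\, p_{nl}\, U^{\otimes n}) = \varphi^{\otimes nl}(p_{nl}), \quad \forall U \in \mathcal{U}_l(\Psi, \varphi). \tag{33}$$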

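The dimension of the symmetric subspace

$$\mathrm{SYM}(\mathcal{A}^{(l)}, n) := \operatorname{span}\{A^{\otimes n} : A \in \mathcal{A}^{(l)}\}$$

is upper bounded by $(n+1)^{\dim \mathcal{A}^{(l)}}$, which leads to the
estimate

$$\operatorname{tr} \overline{p}_{nl} \leq (n+1)^{d^{2l}} \cdot \operatorname{tr} p_{nl}. \tag{34}$$

Using (34) and (31) we obtain

$$\begin{aligned}
\frac{1}{m} \log \varphi^{\otimes m}(\overline{p}_m) &\leq \frac{1}{nl} \log \varphi^{\otimes nl}(\overline{p}_{nl}) \\
&\leq \frac{1}{nl} \log\big((n+1)^{d^{2l}} \cdot \varphi^{\otimes nl}(p_{nl})\big) \\
&\leq \frac{(d^{2l} + d^l) \log(n+1)}{nl} - S(\Psi, \varphi) + \eta_l + \frac{\Delta_n(l)}{l}.
\end{aligned}$$

For fixed l the upper bound above converges to
$-S(\Psi, \varphi) + \eta_l$ for $n \to \infty$. Choosing l sufficiently large,
$\eta_l$ becomes smaller than $\delta$. This proves the upper bound. □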
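
The polynomial dimension bound for symmetric subspaces used in the proof can be checked directly. The sketch below assumes the standard formula $\binom{n+D-1}{D-1}$ for the dimension of the symmetric n-fold tensor power of a D-dimensional vector space; here D plays the role of $\dim \mathcal{A}^{(l)}$, and the function name `sym_dim` is hypothetical.

```python
import math

def sym_dim(D, n):
    """Dimension of the symmetric n-fold tensor power of a D-dimensional space."""
    return math.comb(n + D - 1, D - 1)

# The bound (n + 1)^D holds for every tested pair (D, n).
print(all(sym_dim(D, n) <= (n + 1) ** D for D in range(1, 8) for n in range(40)))

# The growth is only polynomial in n, so it does not affect exponential rates.
for n in (10, 1000, 10**6):
    print(n, math.log2(sym_dim(4, n)) / n)
```

Because the normalised logarithm of the dimension tends to zero, the factor $(n+1)^{d^{2l}}$ only contributes a vanishing correction to the separation rate.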



4 Two examples 

1. Consider a quantum system where $\mathbb{C}^2$ is the underlying Hilbert
space and let v, w be two different non-orthogonal unit vectors in
$\mathbb{C}^2$. Let $\psi^{\otimes n}$ be the product state on
$(\mathcal{B}(\mathbb{C}^2))^{\otimes n}$ with the density operator
$p_w^{\otimes n}$, where $p_w$ is the projection onto the one-dimensional
subspace in $\mathbb{C}^2$ spanned by w. Further let $\delta > 0$ and denote by
$\varphi_\delta$ the state on $\mathcal{B}(\mathbb{C}^2)$ corresponding to the
density operator $(1 - \delta) p_v + \delta p_w$. It seems rather clear that
any reasonable attempt to define empirical distributions (states) in the
quantum context should choose $p_w$ in the case of $\psi^{\otimes n}$ (or, more
generally, the underlying one-site state in the case of a stationary product
state). So, when trying to define typical projectors via empirical states and
to use these in analogy to the classical Sanov's theorem, the n-block typical
projector $p^{(n)}$ for the set
$\Psi = \{\psi\} \subseteq \mathcal{S}(\mathcal{B}(\mathbb{C}^2))$ would be
expected to fulfil $p^{(n)} \geq p_w^{\otimes n}$. Then we have
$\varphi_\delta^{\otimes n}(p^{(n)}) \geq \varphi_\delta^{\otimes n}(p_w^{\otimes n}) = (\delta + (1 - \delta) |\langle v, w \rangle|^2)^n \geq |\langle v, w \rangle|^{2n}$.
On the other hand, the relative entropy of the density operator $p_w$ with
respect to $(1 - \delta) p_v + \delta p_w$ can be made arbitrarily large by
choosing $\delta$ small but positive. This shows that, in contrast to the
classical situation, when relying on empirical states the relative entropy rate
is not an accessible separation rate (which can be at most
$-2 \log |\langle v, w \rangle|$). We might simplify the argument by saying
that though the relative entropy of $\psi$ with respect to $\varphi$ (the case
$\delta = 0$, i.e. density operator $p_v$) is infinite, the separation rate
using empirical distributions remains bounded. But choosing $p^{(n)}$ as
$p_{(v^{\otimes n})^\perp}$, where $(v^{\otimes n})^\perp$ denotes the
orthogonal complement of the vector $v^{\otimes n}$ in
$(\mathbb{C}^2)^{\otimes n}$, yields $\varphi^{\otimes n}(p^{(n)}) = 0$ and
$\psi^{\otimes n}(p^{(n)}) \to 1$, hence the separation rate can in fact be
made infinite when choosing the typical projector in another way.
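
The gap described in this example can be made concrete. The following sketch assumes real unit vectors with overlap $c = \langle v, w \rangle$ and uses the closed-form spectral decomposition of a 2x2 density matrix; the function name `example_rates` and the parameter values are illustrative. It returns the relative entropy of $p_w$ with respect to $(1 - \delta) p_v + \delta p_w$ together with the rate $-\log \langle w, ((1-\delta) p_v + \delta p_w) w \rangle$ obtained from the projector $p_w^{\otimes n}$.

```python
import math

def example_rates(c, delta):
    """c = <v, w> (real, 0 < |c| < 1), 0 < delta < 1.
    Returns (S(p_w, rho_delta) in bits, -log2 <w, rho_delta w>)
    for rho_delta = (1 - delta) p_v + delta p_w."""
    det = delta * (1.0 - delta) * (1.0 - c * c)   # determinant of rho_delta (trace is 1)
    lam_plus = (1.0 + math.sqrt(1.0 - 4.0 * det)) / 2.0
    lam_minus = det / lam_plus                    # numerically stable small eigenvalue
    m = delta + (1.0 - delta) * c * c             # <w, rho_delta w>
    # <w, log2(rho_delta) w> via the two spectral projections of rho_delta
    log_term = (math.log2(lam_plus) * (m - lam_minus)
                - math.log2(lam_minus) * (m - lam_plus)) / (lam_plus - lam_minus)
    # p_w is pure, so its entropy term vanishes and S = -<w, log2(rho_delta) w>
    return -log_term, -math.log2(m)

for delta in (1e-1, 1e-3, 1e-6, 1e-9):
    s, r = example_rates(0.5, delta)
    print(delta, round(s, 3), round(r, 3))
```

As $\delta$ decreases, the first value grows without bound while the second stays below $-2 \log |\langle v, w \rangle|$, which equals 2 bits for the overlap 0.5 chosen here.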

2. A slightly more involved example shows that, again in contrast to the
classical case, there is in general no universal choice of the separating
projector, i.e. it has to depend upon the reference state $\varphi$. This time
we will refer directly to the infinite relative entropy case and leave the
simple 'smoothening' argument which leads to a finite entropy example to the
reader. Let v and w be two orthogonal unit vectors in $\mathbb{C}^2$. Let
$\Phi$ be the set of pure states $\varphi_t$ on $\mathcal{B}(\mathbb{C}^2)$
corresponding to the vectors $v_t := \cos t \cdot v + \sin t \cdot w$,
$t \in [-T, T]$, $\frac{\pi}{2} > T > 0$, and $\Psi = \{\psi\}$, where $\psi$
is the pure state corresponding to w. Assume there is a typical projector
$p^{(n)}$ for $\psi^{\otimes n}$ separating it from each
$\varphi_t^{\otimes n}$ super-exponentially fast. This should be valid for a
universal projector since all the relative entropies $S(\psi, \varphi_t)$ are
infinite. Let $\mathrm{SYM}(n) \subseteq (\mathbb{C}^2)^{\otimes n}$ be the
symmetric n-fold tensor product of $\mathbb{C}^2$. Without loss of generality
we may choose $p^{(n)} \leq p_{\mathrm{SYM}(n)}$ since all
$v_t^{\otimes n}$ as well as $w^{\otimes n}$ belong to $\mathrm{SYM}(n)$.
Observe that the existence of $p^{(n)}$ (with the desired property) implies the
existence of at least one (sequence of) unit vectors $x_n$ in
$\mathrm{SYM}(n)$ such that $\langle x_n, v_t^{\otimes n} \rangle$ tends to
zero super-exponentially fast uniformly in t. Choose an orthonormal basis in
$\mathrm{SYM}(n)$ by
$e_{n,k} := \binom{n}{k}^{1/2} \frac{1}{n!} \sum_{\pi \in \mathrm{PERM}(n)} U_\pi (w^{\otimes k} \otimes v^{\otimes (n-k)})$,
where $\mathrm{PERM}(n)$ is the group of n-permutations and $U_\pi$ is the
unitary operator which interchanges the order in the tensor product according
to $\pi$. Representing $v_t^{\otimes n}$ in that basis yields the numerical
vector $\big(\binom{n}{k}^{1/2} (\sin t)^k (\cos t)^{n-k}\big)_{k=0}^{n}$. So
the question is whether there exists a sequence of unit vectors
$x_n = (x_{n,k})$ such that
$\sup_{t \in [-T, T]} \big| (\cos t)^n \sum_k x_{n,k} \binom{n}{k}^{1/2} (\tan t)^k \big|$
tends to zero super-exponentially fast. Observe that the factor $(\cos t)^n$ is
bounded from below by $(\cos T)^n$ and can be omitted since it goes to zero
only exponentially fast. Moreover, if we replace $x_n$ by
$\tilde{x}_n = \big(x_{n,k} \binom{n}{k}^{1/2}\big)$ we change its norm only by
an at most exponentially smaller factor (the maximum of the binomial
coefficients is of exponential order $2^n$). So we may simplify the problem by
asking whether there is a sequence of unit vectors $x_n$ which has a
super-exponentially decreasing inner product with the numerical vectors
$((\tan t)^k)_{k=0}^{n}$, uniformly in $t \in [-T, T]$. This can be excluded:
Let n be odd and consider the set of values
$t_{n,m} = \arctan\big((1 - \frac{2m}{n}) \cdot \tan T\big)$,
$m = 0, 1, \ldots, n$. Even for this finite set of values we have necessarily
$\sup_m \big| \sum_k x_{n,k} (\tan t_{n,m})^k \big| = \sup_m \big| \sum_k x_{n,k} (1 - \frac{2m}{n})^k (\tan T)^k \big|$
tending to zero at most exponentially fast. In fact, the factor $(\tan T)^k$
can be omitted as before. Let $V_n$ be the Vandermonde matrix
$\big((1 - \frac{2m}{n})^k\big)_{m,k=0}^{n}$. Then the $L_\infty$-norm of the
vector $V_n x_n$ can be estimated by a sub-exponential factor times its
$L_2$-norm, and by Example 6.1 the least singular value of $V_n$ behaves like
$\pi e^{\pi/4} e^{-n(\frac{\pi}{4} + \frac{1}{2} \ln 2)}$